Cross-Modal Similarity Learning via Pairs, Preferences, and Active Supervision
Authors
Abstract
We present a probabilistic framework for learning pairwise similarities between objects belonging to different modalities, such as drugs and proteins, or text and images. Our framework learns a binary-code-based representation for objects in each modality and has the following key properties: (i) it can leverage both pairwise constraints and easy-to-obtain relative-preference-based cross-modal constraints; (ii) the probabilistic framework naturally allows querying for the most useful or informative constraints, facilitating an active learning setting (existing methods for cross-modal similarity learning do not have such a mechanism); and (iii) the binary code length is learned from the data. We demonstrate the effectiveness of the proposed approach on two problems that require computing pairwise similarities between cross-modal object pairs: cross-modal link prediction in bipartite graphs, and hashing-based cross-modal similarity search.
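The ingredients named in the abstract (binary codes per modality, pairwise and relative-preference constraints, and querying for informative constraints) can be illustrated with a small sketch. This is not the paper's model. The codes are taken as given rather than learned, the code length is fixed, and the logistic link, its `scale` parameter, and all function names are assumptions made up for this illustration. Query selection is done with plain uncertainty sampling, which is one common active learning heuristic and may differ from the criterion the authors use.

```python
import math


def hamming_similarity(code_a, code_b):
    """Fraction of bit positions on which two equal-length binary codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    matches = sum(1 for a, b in zip(code_a, code_b) if a == b)
    return matches / len(code_a)


def link_probability(code_a, code_b, scale=4.0):
    """Hypothetical logistic link mapping similarity in [0, 1] to P(pair is similar)."""
    s = hamming_similarity(code_a, code_b)
    return 1.0 / (1.0 + math.exp(-scale * (2.0 * s - 1.0)))


def preference_satisfied(query, preferred, other):
    """Relative constraint: `query` is closer to `preferred` than to `other`."""
    return hamming_similarity(query, preferred) > hamming_similarity(query, other)


def most_uncertain_pair(codes_x, codes_y):
    """Uncertainty sampling: index pair whose link probability is closest to 0.5."""
    best, best_gap = None, float("inf")
    for i, cx in enumerate(codes_x):
        for j, cy in enumerate(codes_y):
            gap = abs(link_probability(cx, cy) - 0.5)
            if gap < best_gap:
                best, best_gap = (i, j), gap
    return best


# Toy codes for two modalities (made-up values, for illustration only).
drugs = [[1, 0, 1, 1], [0, 1, 1, 1]]
proteins = [[1, 0, 1, 0], [0, 1, 1, 1]]
query_pair = most_uncertain_pair(drugs, proteins)
```

In this toy setup, `query_pair` identifies the drug-protein pair whose codes agree on exactly half of their bits, which is the pair a labeler would be asked about first under uncertainty sampling.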
Similar resources
Cross-Modal Supervision for Learning Active Speaker Detection in Video
In this paper, we show how to use audio to supervise the learning of active speaker detection in video. Voice Activity Detection (VAD) guides the learning of the vision-based classifier in a weakly supervised manner. The classifier uses spatio-temporal features to encode upper body motion facial expressions and gesticulations associated with speaking. We further improve a generic model for acti...
Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval
Cross-modal information retrieval aims to find heterogeneous data of various modalities from a given query of one modality. The main challenge is to map different modalities into a common semantic space, in which distance between concepts in different modalities can be well modeled. For cross-modal information retrieval between images and texts, existing work mostly uses off-the-shelf Convolutio...
Probabilistic Semi-Supervised Multi-Modal Hashing
In this paper, we propose a non-parametric Bayesian framework for multi-modal hash learning that takes into account the distance supervision (similarity/dissimilarity constraints). Our model embeds data of arbitrary modalities into a single latent binary feature with the ability to learn the dimensionality of the binary feature using the data itself. Given supervisory information (labeled simil...
Multi-Modal Distance Metric Learning
Multi-modal data is dramatically increasing with the fast growth of social media. Learning a good distance measure for data with multiple modalities is of vital importance for many applications, including retrieval, clustering, classification and recommendation. In this paper, we propose an effective and scalable multi-modal distance metric learning framework. Based on the multi-wing harmonium ...
Assessment of learning style based on VARK model among the students of Qom University of Medical Sciences
Introduction: Learning is a dominant phenomenon in human life. Learners differ from one another in attitudes and cognitive styles, which affect how they learn. In this connection, the VARK learning style model assesses students based on their individual abilities and their methods of obtaining information from the environment along the visual, aural, read/write, and kinesthetic dimensions. ...
Publication date: 2015